Abstract: Linear Programming (LP) decoding is emerging as 
an attractive alternative to decode Low-Density Parity-Check 
(LDPC) codes. However, the earliest LP decoders proposed for 
binary and nonbinary LDPC codes are not suitable for use at 
moderate and large code lengths. To overcome this problem, 
Vontobel et al. developed an iterative Low-Complexity LP (LCLP) 
decoding algorithm for binary LDPC codes. The variable and 
check node calculations of the binary LCLP decoding algorithm 
are related to those of binary Belief Propagation (BP). The 
present authors generalized this work to derive an iterative 
LCLP decoding algorithm for nonbinary linear codes. In contrast 
to binary LCLP, the variable and check node calculations of 
this algorithm are in general different from those of nonbinary 
BP. The overall complexity of nonbinary LCLP decoding is 
linear in the block length; however, the complexity of its check node 
calculations is exponential in the check node degree. In this 
paper, we propose a modified BCJR algorithm for efficient check 
node processing in the nonbinary LCLP decoding algorithm. 
The proposed algorithm has complexity linear in the check node 
degree. We also introduce an alternative state metric to improve 
the run time of the proposed algorithm. Simulation results are 
presented for (504, 252) and (1008, 504) nonbinary LDPC codes 
over Z4. 

I. Introduction 

Binary and nonbinary LDPC codes [1] have attracted much 
attention in the research community in the past decade. LDPC 
codes are generally decoded by the iterative BP algorithm, 
which performs remarkably well at moderate SNR levels. Due 
to their capacity-achieving performance, LDPC codes are used 
in many current communications systems. They are also a 
promising candidate for future high data rate communication 
systems as well as for memory applications. However, BP 
suffers from a so-called error floor problem at high SNR. 
Also, the heuristic nature of BP makes it difficult to analyze, 
and simulations are too time-consuming for the prediction of 
the error floor. 

In recent years, LP decoding has emerged as an attractive 
alternative to BP decoding. LP decoding for binary LDPC codes 
was proposed by Feldman et al. [2]. In LP decoding, the maximum 
likelihood decoding problem is modeled as an LP problem. In 
contrast to BP decoding, LP decoding relies on a well-studied 
branch of mathematics, which provides a basis for better 
understanding of the decoding algorithms. The work of [4] 
extended the LP decoding framework of Feldman et al. to 
nonbinary linear codes. Binary and nonbinary LP decoding 
algorithms rely on standard LP solvers based on simplex or 
interior point methods. However, the time complexity of these 
solvers is known to be exponential in the number of variables, 
which limits the use of LP decoding to codes of small block 
length. To decode longer codes, a specialized low-complexity LP 
decoding algorithm is necessary. Such a low-complexity algorithm 
for binary LDPC codes was proposed by Vontobel et al. in [3]. 
The present authors, in [5], extended the binary LCLP decoding 
algorithm to nonbinary codes. The complexity of the nonbinary 
LCLP decoding algorithm proposed there is linear in the block 
length. As opposed to binary LCLP decoding, nonbinary LCLP 
decoding is not directly related to nonbinary BP. Due to this, 
the complexity of the check node calculations of nonbinary LCLP 
decoding is exponential in the maximum check node degree. In 
this paper, we propose a modified BCJR algorithm for the check 
node processing of nonbinary LCLP decoding. The proposed 
algorithm has complexity linear in the check node degree and 
allows for efficient implementation of nonbinary LCLP decoding. 
We also propose an alternative state metric which can be used 
for faster check node processing. 

This paper is organized as follows. Notation and background 
information are given in Section II. Section III reviews the 
nonbinary LCLP decoding algorithm from [5]. Section IV contains 
the modified BCJR algorithm for check node processing and 
also explains the alternative state metric. Section V presents 
the simulation results, and Section VI concludes the paper. 

II. Notation and Background 

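Let $\mathfrak{R}$ be a finite ring with $q$ elements with $0$ as its 
additive identity. We define $\mathfrak{R}^{-} = \mathfrak{R} \setminus \{0\}$. Let $\mathcal{C}$ be 
a linear code of length $n$ over the ring $\mathfrak{R}$, defined by 
$\mathcal{C} = \{c \in \mathfrak{R}^n : c\mathcal{H}^T = 0\}$, where $\mathcal{H}$ is an $m \times n$ parity-check 
matrix with entries from $\mathfrak{R}$. The rate of the code $\mathcal{C}$ is 
$R(\mathcal{C}) = \log_q(|\mathcal{C}|)/n$. Hence, the code $\mathcal{C}$ is an $[n, \log_q(|\mathcal{C}|)]$ linear code 
over $\mathfrak{R}$. The row indices and column indices of $\mathcal{H}$ are denoted 
by the sets $\mathcal{J} = \{1, \dots, m\}$ and $\mathcal{I} = \{1, \dots, n\}$ respectively. 
The $j$-th row of $\mathcal{H}$ is denoted by $\mathcal{H}_j$ and the $i$-th column of $\mathcal{H}$ 
is denoted by $\mathcal{H}^i$. The support of a vector $c$ is denoted by $\mathrm{supp}(c)$. 
For each $j \in \mathcal{J}$, let $\mathcal{I}_j = \mathrm{supp}(\mathcal{H}_j)$ and for each $i \in \mathcal{I}$, 
let $\mathcal{J}_i = \mathrm{supp}(\mathcal{H}^i)$. Also let $d_j = |\mathcal{I}_j|$ and $d = \max_{j \in \mathcal{J}} \{d_j\}$. 
We define the set 

$$\mathcal{E} = \{(i,j) \in \mathcal{I} \times \mathcal{J} : j \in \mathcal{J},\ i \in \mathcal{I}_j\} = \{(i,j) \in \mathcal{I} \times \mathcal{J} : i \in \mathcal{I},\ j \in \mathcal{J}_i\}.$$

Moreover, for each $j \in \mathcal{J}$ we define the local Single Parity Check (SPC) code 

$$\mathcal{C}_j = \Big\{ (b_i)_{i \in \mathcal{I}_j} : \sum_{i \in \mathcal{I}_j} b_i \cdot \mathcal{H}_{j,i} = 0 \Big\}.$$

For each $i \in \mathcal{I}$, $\mathcal{A}_i \subseteq \mathfrak{R}^{|\{0\} \cup \mathcal{J}_i|}$ denotes the repetition code 
of the appropriate length and indexing. We also use variables 
$u_{i,j} = (u_{i,j}^{(\alpha)})_{\alpha \in \mathfrak{R}^{-}}$ and $v_{j,i} = (v_{j,i}^{(\alpha)})_{\alpha \in \mathfrak{R}^{-}}$ for all $i \in \mathcal{I}$, 
$j \in \mathcal{J}_i \cup \{0\}$; also, for $i \in \mathcal{I}$, $u_i = (u_{i,j})_{j \in \mathcal{J}_i \cup \{0\}}$ and similarly, 
for $j \in \mathcal{J}$, $v_j = (v_{j,i})_{i \in \mathcal{I}_j}$. 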

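We use the following mapping given in [4], 

$$\xi : \mathfrak{R} \to \{0,1\}^{q-1} \subset \mathbb{R}^{q-1}$$

by 

$$\xi(\alpha) = x = (x^{(\rho)})_{\rho \in \mathfrak{R}^{-}}$$

such that, for each $\rho \in \mathfrak{R}^{-}$, 

$$x^{(\rho)} = \begin{cases} 1, & \text{if } \rho = \alpha \\ 0, & \text{otherwise.} \end{cases}$$

We extend this mapping to define 

$$\Xi : \bigcup_{t \in \mathbb{Z}^{+}} \mathfrak{R}^{t} \to \bigcup_{t \in \mathbb{Z}^{+}} \{0,1\}^{(q-1)t} \subset \bigcup_{t \in \mathbb{Z}^{+}} \mathbb{R}^{(q-1)t}$$

where 

$$\Xi(c) = (\xi(c_1), \dots, \xi(c_t)), \quad \forall c \in \mathfrak{R}^{t},\ t \in \mathbb{Z}^{+}.$$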
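
As a concrete illustration of the mapping, the following minimal sketch assumes the ring is $\mathbb{Z}_q$ with elements represented by the integers 0 to q-1. The function names `xi` and `Xi` are chosen here for illustration only.

```python
from typing import List, Sequence


def xi(alpha: int, q: int) -> List[int]:
    """Map a ring element alpha of Z_q to a 0/1 vector of length q-1.

    The entry for rho in {1, ..., q-1} is 1 exactly when rho == alpha,
    so the zero element maps to the all-zero vector.
    """
    return [1 if rho == alpha else 0 for rho in range(1, q)]


def Xi(c: Sequence[int], q: int) -> List[int]:
    """Concatenate xi(c_1), ..., xi(c_t) into one vector of length (q-1)*t."""
    out: List[int] = []
    for symbol in c:
        out.extend(xi(symbol, q))
    return out


print(xi(2, 4))          # [0, 1, 0]
print(Xi([1, 0, 3], 4))  # [1, 0, 0, 0, 0, 0, 0, 0, 1]
```

With this representation, an inner product of the form $\langle v, \xi(b) \rangle$ simply selects the entry of $v$ indexed by $b$, or equals 0 when $b = 0$.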

For $\kappa \in \mathbb{R}$, $\kappa > 0$, we define the function $\psi(x) = e^{\kappa x}$ and 
its inverse $\psi^{-1}(x) = \frac{1}{\kappa} \log(x)$. We also use the soft-minimum 
operator introduced in [3]. For any $\kappa \in \mathbb{R}$, $\kappa > 0$, the 
soft-minimum operator is defined as 

$$\min_{l}{}^{(\kappa)} \{z_l\} \triangleq -\frac{1}{\kappa} \log \Big( \sum_{l} e^{-\kappa z_l} \Big)$$

where $\min_l^{(\kappa)} \{z_l\} \le \min_l \{z_l\}$, with equality attained in the 
limit as $\kappa \to \infty$. 
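
The behaviour of the soft-minimum operator can be checked numerically. The sketch below is an illustration only: the function name `soft_min` is hypothetical, and the shift by the true minimum is a standard numerical-stability device rather than part of the definition.

```python
import math
from typing import Sequence


def soft_min(z: Sequence[float], kappa: float) -> float:
    """Soft-minimum: -(1/kappa) * log(sum_l exp(-kappa * z_l)).

    The minimum m is factored out so that every exponent is <= 0,
    which avoids overflow for large kappa.
    """
    m = min(z)
    total = sum(math.exp(-kappa * (zl - m)) for zl in z)
    return m - math.log(total) / kappa


z = [1.0, 2.0, 3.5]
for kappa in (1.0, 5.0, 50.0):
    # The value never exceeds min(z) = 1.0 and approaches it as kappa grows.
    print(kappa, soft_min(z, kappa))
```

Because the shifted sum is at least 1, the returned value never exceeds the true minimum, in agreement with the inequality above.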

We assume transmission over a $q$-ary input memoryless 
channel and also assume that a corrupted codeword $y = 
(y_1, y_2, \dots, y_n) \in \Sigma^n$ has been received. Here, the set of channel 
output symbols is denoted by $\Sigma$. Based on this, we define a 
vector $\lambda = (\lambda^{(\alpha)})_{\alpha \in \mathfrak{R}^{-}}$ where, for each $y \in \Sigma$, $\alpha \in \mathfrak{R}^{-}$, 

$$\lambda^{(\alpha)}(y) = \log \left( \frac{p(y|0)}{p(y|\alpha)} \right).$$

Here $p(y|c)$ denotes the channel output probability (density) 
conditioned on the channel input. 
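
For a complex AWGN channel, the vector $\lambda$ reduces to scaled differences of squared distances. The sketch below assumes $q = 4$, a noise variance `sigma2` per real dimension, and a hypothetical symbol-to-QPSK mapping $k \mapsto e^{j\pi k/2}$; the paper does not specify the mapping, so `qpsk_point` is an assumption made for illustration.

```python
import cmath
import math
from typing import List


def qpsk_point(symbol: int) -> complex:
    """Assumed mapping of a Z4 symbol k to the unit-energy point exp(j*pi*k/2)."""
    return cmath.exp(1j * math.pi * symbol / 2)


def llr_vector(y: complex, sigma2: float) -> List[float]:
    """Return [lambda^(1)(y), lambda^(2)(y), lambda^(3)(y)].

    With p(y|c) proportional to exp(-|y - x_c|^2 / (2*sigma2)),
    log(p(y|0)/p(y|alpha)) = (|y - x_alpha|^2 - |y - x_0|^2) / (2*sigma2).
    """
    d0 = abs(y - qpsk_point(0)) ** 2
    return [(abs(y - qpsk_point(a)) ** 2 - d0) / (2 * sigma2) for a in range(1, 4)]


# A noiseless observation of symbol 0 favours 0 over every other symbol.
print(llr_vector(qpsk_point(0), sigma2=0.5))
```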

III. Low Complexity LP Decoding of Nonbinary 
Linear Codes 

To develop a low complexity LP solver for nonbinary 
linear codes, the present authors in [5] proposed a primal LP 
formulation which is equivalent to the original LP formulation 
proposed in [4]. This primal LP formulation has the advantage 
that it is in one-to-one correspondence with the Forney-style 
factor graph of the code and can be used to derive a suitable 
dual LP (see Section IV in [5]). The dual LP is then "softened" 
by using the "soft-min" operator, and the softened dual LP is 
used to derive the update equations given in Lemma 6.1 in [5]. 
The softened dual LP is referred to as SDNBLPD; its formulation 
is given in [5]. 

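The update equation can be used to update the dual variable 
$u_{i,j}^{(\alpha)}$ related to an edge $(i,j) \in \mathcal{E}$ while all other edge variables 
are held constant. The updated value of $u_{i,j}^{(\alpha)}$ is given by 

$$u_{i,j}^{(\alpha)} = \frac{1}{2} \Big( \big( V_{i,\bar{\alpha}} - V_{i,\alpha} \big) - \big( C_{j,\bar{\alpha}} - C_{j,\alpha} \big) \Big)$$

where 

$$V_{i,\bar{\alpha}} \triangleq -\min_{a \in \mathcal{A}_i,\ a_j \neq \alpha}{}^{(\kappa)} \langle -\tilde{u}_i, \Xi(\tilde{a}) \rangle, \qquad V_{i,\alpha} \triangleq -\min_{a \in \mathcal{A}_i,\ a_j = \alpha}{}^{(\kappa)} \langle -\tilde{u}_i, \Xi(\tilde{a}) \rangle,$$

$$C_{j,\bar{\alpha}} \triangleq -\min_{b \in \mathcal{C}_j,\ b_i \neq \alpha}{}^{(\kappa)} \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle, \qquad C_{j,\alpha} \triangleq -\min_{b \in \mathcal{C}_j,\ b_i = \alpha}{}^{(\kappa)} \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle.$$

Here the vector $\tilde{u}_i$ is the vector $u_i$ where the subvector 
$u_{i,j}$ is excluded. Similarly, the vector $\tilde{v}_j$ is obtained by excluding 
the subvector $v_{j,i}$ from $v_j$. The vector $\tilde{a}$ is the same as $a$ where 
the $j$-th position is omitted, and the vector $\tilde{b}$ is obtained by 
excluding the $i$-th position from $b$. Now, by updating all 
the edges $(i,j) \in \mathcal{E}$ with some schedule (e.g. circular), 
the low-complexity LP decoding algorithm converges to the 
maximum of the SDNBLPD (see Lemma 6.2 in [5]). The 
overall complexity of this algorithm is linear in the block 
length. 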

The terms $(V_{i,\bar{\alpha}} - V_{i,\alpha})$ and $(C_{j,\bar{\alpha}} - C_{j,\alpha})$ are related to 
the variable node (VN) $i \in \mathcal{I}$ and check node (CN) $j \in \mathcal{J}$ 
respectively. In the binary case, these terms can be efficiently 
calculated with the VN and CN calculations of the binary 
Sum-Product (SP) algorithm respectively [3]. However, for 
nonbinary codes, the calculation of $(V_{i,\bar{\alpha}} - V_{i,\alpha})$ and $(C_{j,\bar{\alpha}} - C_{j,\alpha})$ 
is not related to the VN and CN calculations of the nonbinary 
SP algorithm [5]. Hence, the CN calculations are carried out 
by exhaustively processing all of the possible codewords of 
the SPC code $\mathcal{C}_j$. Consequently, the complexity of calculating 
$(C_{j,\bar{\alpha}} - C_{j,\alpha})$ (i.e. of the CN calculation) is exponential in the 
maximum check-node degree $d$. 
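
To make the exponential growth concrete, the following sketch enumerates the local codewords of a single check over $\mathbb{Z}_4$. The coefficient pattern `[1, 3, 3, 1, 1, 1]` mirrors the row structure described in Section V; the function name `local_codewords` is hypothetical and the enumeration is only an illustration of the exhaustive approach, not the decoder itself.

```python
import itertools
from typing import Iterator, Sequence, Tuple


def local_codewords(h: Sequence[int], q: int) -> Iterator[Tuple[int, ...]]:
    """Yield every word b over Z_q with sum_i h[i] * b[i] = 0 (mod q)."""
    for b in itertools.product(range(q), repeat=len(h)):
        if sum(hi * bi for hi, bi in zip(h, b)) % q == 0:
            yield b


row = [1, 3, 3, 1, 1, 1]  # check node of degree 6 over Z4
count = sum(1 for _ in local_codewords(row, 4))
print(count)  # 1024, i.e. 4**(6-1)
```

Since at least one coefficient is a unit of $\mathbb{Z}_4$, every choice of the other five symbols fixes the remaining one, so the number of local codewords is $q^{d_j - 1}$ and an exhaustive check node update must visit all of them.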



IV. Modified BCJR Algorithm for Check Node 
Calculation of the Low Complexity LP Decoding 

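In [5], the authors suggested that the equations for $C_{j,\alpha}$ and 
$C_{j,\bar{\alpha}}$ can be rewritten as follows: 

$$\psi(C_{j,\alpha}) = \sum_{b \in \mathcal{C}_j,\ b_i = \alpha} \psi\big( \langle \tilde{v}_j, \Xi(\tilde{b}) \rangle \big) \tag{1}$$

$$\psi(C_{j,\bar{\alpha}}) = \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \psi\big( \langle \tilde{v}_j, \Xi(\tilde{b}) \rangle \big) \tag{2}$$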
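
The step from the soft-minimum definitions of Section III to this sum form follows directly from the definitions of $\psi$ and of the soft-minimum operator. It is written out here for $C_{j,\alpha}$; the case $C_{j,\bar{\alpha}}$ is identical with the condition $b_i \neq \alpha$.

```latex
\begin{align*}
C_{j,\alpha}
  &= -\min_{b \in \mathcal{C}_j,\, b_i = \alpha}{}^{(\kappa)}
       \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle
   = \frac{1}{\kappa} \log \sum_{b \in \mathcal{C}_j,\, b_i = \alpha}
       e^{\kappa \langle \tilde{v}_j, \Xi(\tilde{b}) \rangle}, \\
\psi(C_{j,\alpha})
  &= e^{\kappa C_{j,\alpha}}
   = \sum_{b \in \mathcal{C}_j,\, b_i = \alpha}
       \psi\big( \langle \tilde{v}_j, \Xi(\tilde{b}) \rangle \big).
\end{align*}
```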



It may be observed from the above equations that the 
calculation of $C_{j,\alpha}$ and $C_{j,\bar{\alpha}}$ is in the form of the 
marginalization of a product of functions. Hence it is possible 
to compute $C_{j,\alpha}$ and $C_{j,\bar{\alpha}}$ with the help of a trellis-based 
variant of the SP algorithm (i.e. a BCJR-type algorithm). One 
possibility is to use the trellis of the binary nonlinear code 
$\mathcal{C}_j^{NL} = \{\Xi(b) : \forall b \in \mathcal{C}_j\}$. However, due to the nonlinear nature 
of this binary code, the state complexity at the center of its 
trellis would be exponential in $d_j$. State merging is also 
not possible here. Hence there is no complexity advantage when 
we use the trellis of the binary nonlinear code $\mathcal{C}_j^{NL}$. 

However, if the trellis for the nonbinary SPC code $\mathcal{C}_j$ is 
used, then the state complexity at each trellis step is $O(q)$ and 
is independent of $d_j$. The branch complexity of this trellis 
is $O(q^2)$. In the following, we prove that the marginals $C_{j,\bar{\alpha}}$ 
and $C_{j,\alpha}$ can be efficiently calculated with some modifications 
to the BCJR algorithm which uses the trellis of the nonbinary 
code $\mathcal{C}_j$. For this purpose we define the following for the trellis 
of the SPC code $\mathcal{C}_j$: 

1) The set of all states at time $t$ is $\mathcal{S}_t$, $t \in \{0, \dots, d_j\}$. 

2) $(s, s') \in (\mathcal{S}_t, \mathcal{S}_{t+1})$ represents a branch in the trellis 
which is related to the symbol $b_t = s' - s$. 

3) Since this is the trellis of an SPC code, each state $s \in \mathcal{S}_t$ 
represents the sum of all symbols from $b_0$ to $b_{t-1}$. 

4) We define 

$$\sigma(i,j) = \sum_{r=i}^{j} b_r, \quad b \in \mathcal{C}_j.$$

5) The branch metric for each $(s, s') \in (\mathcal{S}_t, \mathcal{S}_{t+1})$ is $g(s, s') = 
g(b_t) = \psi\big( \langle v_{j,t}, \xi(b_t) \rangle \big)$. 

6) The state metric for the forward recursion is 

$$\mu_i(s) = \sum_{\substack{(b_0, \dots, b_{i-1}) \\ \sigma(0,i-1) = s}} \ \prod_{t=0}^{i-1} g(b_t), \quad s \in \mathcal{S}_i,\ i \in \mathcal{I}_j \tag{3}$$

with $\mu_0(0) = 1$, $\mu_0(a) = 0$, $\forall a \in \mathfrak{R}^{-}$, 

and the state metric for the backward recursion is 

$$\nu_i(s) = \sum_{\substack{(b_i, \dots, b_{d_j-1}) \\ \sigma(i,d_j-1) = s}} \ \prod_{t=i}^{d_j-1} g(b_t), \quad s \in \mathcal{S}_i,\ i \in \mathcal{I}_j \tag{4}$$

with $\nu_{d_j}(0) = 1$, $\nu_{d_j}(a) = 0$, $\forall a \in \mathfrak{R}^{-}$. 



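Lemma 4.1: $C_{j,\bar{\alpha}}$ and $C_{j,\alpha}$ can be efficiently computed on 
the trellis of the nonbinary code $\mathcal{C}_j$ as follows, 

$$\psi(C_{j,\bar{\alpha}}) = \sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s \neq \alpha}} \mu_i(s) \cdot \nu_{i+1}(s') \cdot g(s' - s) \tag{5}$$

$$\psi(C_{j,\alpha}) = \sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s = \alpha}} \mu_i(s) \cdot \nu_{i+1}(s') \tag{6}$$

where the state metrics $\mu_i$ and $\nu_i$ are calculated recursively from 
previous state metrics via 

$$\mu_i(s) = \sum_{b_{i-1} \in \mathfrak{R}} \mu_{i-1}(s - b_{i-1}) \cdot g(b_{i-1}),$$

with an analogous recursion for $\nu_i$. 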

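Proof: First we prove that the state metrics can be 
computed recursively. The following may be observed from 
the definition of $\mu_i(s)$, 

$$\begin{aligned}
\mu_i(s) &= \sum_{\substack{(b_0, \dots, b_{i-1}) \\ \sigma(0,i-1) = s}} \ \prod_{t=0}^{i-1} g(b_t) \\
&= \sum_{\substack{(b_0, \dots, b_{i-1}) \\ \sigma(0,i-2) + b_{i-1} = s}} \ \prod_{t=0}^{i-1} g(b_t) \\
&= \sum_{b_{i-1} \in \mathfrak{R}} \Bigg( \sum_{\substack{(b_0, \dots, b_{i-2}) \\ \sigma(0,i-2) = s - b_{i-1}}} \ \prod_{t=0}^{i-2} g(b_t) \Bigg) \cdot g(b_{i-1}) \\
&= \sum_{b_{i-1} \in \mathfrak{R}} \mu_{i-1}(s - b_{i-1}) \cdot g(b_{i-1}).
\end{aligned}$$

Hence $\mu_i(s)$ can be calculated recursively from the previous 
state metrics. Similarly, we can prove that $\nu_i(s)$ can be 
calculated from previous state metrics. 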

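Now we prove the other part of the lemma. For ease of 
exposition we assume $\mathcal{I}_j = \{0, \dots, d_j - 1\}$ in the following. 

$$\begin{aligned}
\psi(C_{j,\bar{\alpha}}) &= \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \psi\big( \langle \tilde{v}_j, \Xi(\tilde{b}) \rangle \big) \\
&= \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \psi\Big( \sum_{t=0}^{d_j-1} \langle v_{j,t}, \xi(b_t) \rangle \Big) \\
&= \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \Big( \prod_{t=0}^{d_j-1} \psi\big( \langle v_{j,t}, \xi(b_t) \rangle \big) \Big) \\
\Rightarrow \psi(C_{j,\bar{\alpha}}) &= \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \Big( \prod_{t=0}^{d_j-1} g(b_t) \Big)
\end{aligned} \tag{7}$$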



Fig. 1. States connected by dotted branches are used for the calculation 
of $C_{j,1}$. 

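The right-hand side of (5) is 

$$\begin{aligned}
&\sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s \neq \alpha}} \mu_i(s) \cdot \nu_{i+1}(s') \cdot g(s' - s) \\
&= \sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s \neq \alpha}} \Bigg( \sum_{\substack{(b_0, \dots, b_{i-1}) \\ \sigma(0,i-1) = s}} \ \prod_{t=0}^{i-1} g(b_t) \Bigg) \cdot \Bigg( \sum_{\substack{(b_{i+1}, \dots, b_{d_j-1}) \\ \sigma(i+1,d_j-1) = s'}} \ \prod_{t=i+1}^{d_j-1} g(b_t) \Bigg) \cdot g(s' - s) \\
&= \sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s \neq \alpha}} \ \sum_{\substack{(b_0, \dots, b_{i-1}, b_{i+1}, \dots, b_{d_j-1}) \\ \sigma(0,i-1) = s,\ \sigma(i+1,d_j-1) = s'}} \Bigg( \prod_{t=0}^{i-1} g(b_t) \Bigg) \cdot \Bigg( \prod_{t=i+1}^{d_j-1} g(b_t) \Bigg) \cdot g(s' - s)
\end{aligned}$$

$$\Rightarrow \sum_{\substack{(s,s') \in (\mathcal{S}_i, \mathcal{S}_{i+1}) \\ s' - s \neq \alpha}} \mu_i(s) \cdot \nu_{i+1}(s') \cdot g(s' - s) = \sum_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \Big( \prod_{t=0}^{d_j-1} g(b_t) \Big) \tag{8}$$

Using (8) in (7) we get (5). Equation (6) can be proved in a 
similar manner. ■ 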
The overall algorithm works in two phases: in the first 
phase, the forward and backward state metrics are calculated 
and stored; in the second phase, the marginals $C_{j,\bar{\alpha}}$ and 
$C_{j,\alpha}$ are computed with Lemma 4.1, where the state metrics 
computed in the first phase are utilized. It may be observed that 
the aforementioned algorithm is essentially the same as the 
BCJR algorithm except for the second phase, where the marginals 
are calculated. 
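
The two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ring is $\mathbb{Z}_q$, all parity-check coefficients are taken to be 1 (matching the branch labeling $b_t = s' - s$), and the backward metric is indexed by the trellis state (the prefix sum), so that the remaining symbols return the state to 0. All function names are hypothetical. The brute-force reference sums $\prod_t g(b_t)$ over codewords with $b_i \neq \alpha$ as in (7), and omits the factor of position $i$ when $b_i = \alpha$ as in (6).

```python
import itertools
import math
from typing import List, Sequence, Tuple


def g(v_t: Sequence[float], b: int, kappa: float) -> float:
    """Branch metric psi(<v_{j,t}, xi(b)>); the symbol 0 gives exp(0) = 1."""
    return 1.0 if b == 0 else math.exp(kappa * v_t[b - 1])


def state_metrics(v, q: int, kappa: float) -> Tuple[List[List[float]], List[List[float]]]:
    """Phase 1: forward metrics mu[t][s] and backward metrics nu[t][s], t = 0..d."""
    d = len(v)
    mu = [[0.0] * q for _ in range(d + 1)]
    nu = [[0.0] * q for _ in range(d + 1)]
    mu[0][0] = 1.0  # the trellis starts in state 0
    nu[d][0] = 1.0  # and must end in state 0
    for t in range(d):
        for s in range(q):
            for b in range(q):
                mu[t + 1][(s + b) % q] += mu[t][s] * g(v[t], b, kappa)
    for t in range(d - 1, -1, -1):
        for s in range(q):
            nu[t][s] = sum(g(v[t], b, kappa) * nu[t + 1][(s + b) % q] for b in range(q))
    return mu, nu


def marginals(v, i: int, alpha: int, q: int, kappa: float) -> Tuple[float, float]:
    """Phase 2: return (psi(C_bar_alpha), psi(C_alpha)) at position i via (5) and (6)."""
    mu, nu = state_metrics(v, q, kappa)
    psi_bar, psi_eq = 0.0, 0.0
    for s in range(q):
        for b in range(q):
            through = mu[i][s] * nu[i + 1][(s + b) % q]
            if b == alpha:
                psi_eq += through                        # expression (6)
            else:
                psi_bar += through * g(v[i], b, kappa)   # expression (5)
    return psi_bar, psi_eq


def brute_force(v, i: int, alpha: int, q: int, kappa: float) -> Tuple[float, float]:
    """Exhaustive reference over all q**(d-1) codewords of the SPC code."""
    d = len(v)
    psi_bar, psi_eq = 0.0, 0.0
    for head in itertools.product(range(q), repeat=d - 1):
        b = head + ((-sum(head)) % q,)
        if b[i] == alpha:
            psi_eq += math.prod(g(v[t], b[t], kappa) for t in range(d) if t != i)
        else:
            psi_bar += math.prod(g(v[t], b[t], kappa) for t in range(d))
    return psi_bar, psi_eq


v = [[0.3, -0.2, 0.5], [-0.4, 0.1, 0.0], [0.2, 0.2, -0.6], [0.7, -0.1, 0.3]]
print(marginals(v, 1, 2, 4, 1.5))
print(brute_force(v, 1, 2, 4, 1.5))
```

The first phase costs on the order of $d_j q^2$ operations, whereas the exhaustive reference visits $q^{d_j - 1}$ codewords. The values $C_{j,\bar{\alpha}}$ and $C_{j,\alpha}$ themselves follow by applying $\psi^{-1}(x) = \frac{1}{\kappa} \log(x)$ to the returned quantities.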

The calculations of Lemma 4.1 can be visualized with 
the help of the trellis diagram. Figures 1 and 2 show the trellis 
for the nonbinary SPC code of length 4 which is defined over 
$\mathbb{Z}_4$. Here $b_0$ to $b_3$ represent the symbols, and states are represented 
by $s_t^i$, where $t$ indicates the symbol after which the state occurs 
and $i$ represents the sum of the symbols from $b_0$ to $b_{t-1}$. 
The dotted branches in Figure 1 represent the transitions 
related to the symbol $b_1 = 1$. The state pairs which are 
connected by these branches are used for the calculation of 
$C_{j,1}$. Similarly, the dotted branches in Figure 2 represent 
transitions related to the symbol $b_1 \neq 1$. Here the metrics of 
the corresponding state pairs are used for the calculation of 
$C_{j,\bar{1}}$. 

Fig. 2. States connected by dotted branches are used for the calculation 
of $C_{j,\bar{1}}$. 

A. Alternative State Metric for Faster Calculation of $C_{j,\bar{\alpha}}$ 

The forward state metric $\mu$ as defined in (3) needs to be 
computed for the calculation of $C_{j,\alpha}$ and can be reused for the 
calculation of $C_{j,\bar{\alpha}}$. In (5) the algorithm needs to go through all 
branches $(s, s') \in (\mathcal{S}_i, \mathcal{S}_{i+1})$, $s' - s \neq \alpha$, for the calculation of 
$C_{j,\bar{\alpha}}$. If the proposed algorithm is implemented in hardware or 
on multicore architectures, then the computation time for $C_{j,\bar{\alpha}}$ 
can be reduced by parallelizing its calculation. One possibility 
to parallelize the calculation of $C_{j,\bar{\alpha}}$ is to define a new forward 
state metric $\hat{\mu}$, which can be computed in parallel with $\mu$ in 
the first phase and reduces the calculations required during the 
second phase of the algorithm. For this we define an alternative 
forward state metric as follows, 

$$\hat{\mu}_i(s, \alpha) = \sum_{\substack{(b_0, \dots, b_{i-1}) \\ \sigma(0,i-1) = s,\ b_{i-1} \neq \alpha}} \ \prod_{t=0}^{i-1} g(b_t), \quad s \in \mathcal{S}_i,\ i \in \mathcal{I}_j,\ \alpha \in \mathfrak{R}^{-} \tag{9}$$

with $\hat{\mu}_0(s, \alpha) = 0$, $\forall s \in \mathcal{S}_0$, $\forall \alpha \in \mathfrak{R}^{-}$. 

It should be noted that, due to the condition $b_{i-1} \neq \alpha$, 
$\hat{\mu}_i(s, \alpha)$ cannot be calculated recursively from $\hat{\mu}_{i-1}$; instead 
it is calculated together with $\mu_i(s)$ from $\mu_{i-1}$ as follows, 

$$\hat{\mu}_i(s, \alpha) = \sum_{b_{i-1} \in \mathfrak{R} \setminus \{\alpha\}} \mu_{i-1}(s - b_{i-1}) \cdot g(b_{i-1}).$$

With the help of the alternative forward state metric given in 
equation (9), the expression (5) of Lemma 4.1 can be rewritten 
as 

$$\psi(C_{j,\bar{\alpha}}) = \sum_{s' \in \mathcal{S}_{i+1}} \hat{\mu}_{i+1}(s', \alpha) \cdot \nu_{i+1}(s') \tag{10}$$

The forward state metric $\hat{\mu}_i(s, \alpha)$ requires the calculation 
and storage of an additional $q - 1$ values for each state $s \in 
\mathcal{S}_i$ during the first phase. Hence the storage requirement for 
the calculation of $C_{j,\bar{\alpha}}$ with (10) increases by a factor of $q$. 
However, all additional state metric values can be calculated in 
parallel with $\mu$, which does not affect the run time of the first 
phase of the algorithm. Also, the second phase of the algorithm 
needs to go through only $q$ states instead of $q(q - 1)$ branches; 
hence the overall run time for computing $C_{j,\bar{\alpha}}$ is reduced with 
the state metric $\hat{\mu}$. 
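
The alternative metric can be sketched in the same style. The code below is an illustration under the same assumptions as before (ring $\mathbb{Z}_q$, unit parity-check coefficients, backward metric indexed by the trellis state); the names are hypothetical, and the sequential inner loop over $\alpha$ stands in for what would be parallel hardware or multicore work.

```python
import math
from typing import List, Sequence, Tuple


def g(v_t: Sequence[float], b: int, kappa: float) -> float:
    """Branch metric psi(<v_{j,t}, xi(b)>); the symbol 0 gives exp(0) = 1."""
    return 1.0 if b == 0 else math.exp(kappa * v_t[b - 1])


def forward_metrics(v, q: int, kappa: float):
    """Return mu[t][s] and mu_hat[t][s][alpha], built in one pass over the trellis.

    mu_hat[t][s][alpha] collects the paths into state s at time t whose last
    symbol differs from alpha (index alpha = 0 is unused).
    """
    d = len(v)
    mu = [[0.0] * q for _ in range(d + 1)]
    mu_hat = [[[0.0] * q for _ in range(q)] for _ in range(d + 1)]
    mu[0][0] = 1.0
    for t in range(d):
        for s in range(q):
            for b in range(q):
                w = mu[t][s] * g(v[t], b, kappa)
                s_next = (s + b) % q
                mu[t + 1][s_next] += w
                # A branch with symbol b feeds every mu_hat entry with alpha != b.
                for alpha in range(1, q):
                    if alpha != b:
                        mu_hat[t + 1][s_next][alpha] += w
    return mu, mu_hat


def backward_metrics(v, q: int, kappa: float) -> List[List[float]]:
    d = len(v)
    nu = [[0.0] * q for _ in range(d + 1)]
    nu[d][0] = 1.0
    for t in range(d - 1, -1, -1):
        for s in range(q):
            nu[t][s] = sum(g(v[t], b, kappa) * nu[t + 1][(s + b) % q] for b in range(q))
    return nu


def psi_c_bar_states(v, i: int, alpha: int, q: int, kappa: float) -> float:
    """Second phase via (10): a sum over the q states at time i+1."""
    _, mu_hat = forward_metrics(v, q, kappa)
    nu = backward_metrics(v, q, kappa)
    return sum(mu_hat[i + 1][s][alpha] * nu[i + 1][s] for s in range(q))


def psi_c_bar_branches(v, i: int, alpha: int, q: int, kappa: float) -> float:
    """Second phase via (5): a sum over the q*(q-1) branches with symbol != alpha."""
    mu, _ = forward_metrics(v, q, kappa)
    nu = backward_metrics(v, q, kappa)
    return sum(mu[i][s] * nu[i + 1][(s + b) % q] * g(v[i], b, kappa)
               for s in range(q) for b in range(q) if b != alpha)


v = [[0.3, -0.2, 0.5], [-0.4, 0.1, 0.0], [0.2, 0.2, -0.6], [0.7, -0.1, 0.3]]
print(psi_c_bar_states(v, 2, 1, 4, 1.5), psi_c_bar_branches(v, 2, 1, 4, 1.5))
```

Both routes give the same value; the difference is that the state-based sum touches $q$ terms in the second phase, at the price of the extra storage in `mu_hat`.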

B. Calculation of Marginals with $\kappa \to \infty$ 

In Lemma 4.1, $\kappa$ is assumed to be finite. However, for many 
practical applications we are interested in $\kappa \to \infty$. According 
to Lemma 6.3 of [5], for $\kappa \to \infty$ we again need to calculate 
$(C_{j,\bar{\alpha}} - C_{j,\alpha})$ to update the corresponding variables. However, 
the marginals $C_{j,\alpha}$ and $C_{j,\bar{\alpha}}$ are here obtained as the limit of 
equations (1) and (2) respectively as $\kappa \to \infty$, i.e., 

$$C_{j,\alpha} = -\min_{b \in \mathcal{C}_j,\ b_i = \alpha} \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle, \qquad C_{j,\bar{\alpha}} = -\min_{b \in \mathcal{C}_j,\ b_i \neq \alpha} \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle \tag{11}$$

Thus $C_{j,\alpha}$ and $C_{j,\bar{\alpha}}$ can be obtained by replacing all "product" 
operations with "sum" operations and similarly by replacing 
all "sum" operations with "min" operations in (1) and (2) 
(marginals with finite $\kappa$). In (1) and (2) the marginalization is 
performed in the sum-product semiring. However, for $\kappa \to \infty$ 
the marginalization is performed in the min-sum semiring, and 
hence the marginals of (11) can be computed with a trellis-based 
variant of the min-sum algorithm. If we redefine the 
branch metric as $g(b_t) = \langle v_{j,t}, \xi(b_t) \rangle$ and replace all 
"product" operations with "sum" operations and similarly replace 
all "sum" operations with "min" operations in the state metric 
definitions, the expressions of Lemma 4.1 and (10), 
then the resulting equations can be used 
on the trellis of the nonbinary SPC code $\mathcal{C}_j$ to compute the 
marginals of (11). This trellis-based variant of the min-sum 
algorithm is related to the Viterbi algorithm. 
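
A minimal sketch of this min-sum variant is given below. It rests on assumptions that should be read as such: the ring is $\mathbb{Z}_q$ with unit parity-check coefficients, the branch cost is taken as $\langle -v_{j,t}, \xi(b_t) \rangle$ with a final negation so that the result matches the form $-\min \langle -\tilde{v}_j, \Xi(\tilde{b}) \rangle$ of (11), and the treatment of position $i$ follows (5) and (6). The function names are hypothetical.

```python
import itertools
from typing import Sequence, Tuple

INF = float("inf")


def cost(v_t: Sequence[float], b: int) -> float:
    """Branch cost <-v_{j,t}, xi(b)>: minus the entry of v_{j,t} for b, or 0 if b = 0."""
    return 0.0 if b == 0 else -v_t[b - 1]


def min_sum_marginals(v, i: int, alpha: int, q: int) -> Tuple[float, float]:
    """Return (C_bar_alpha, C_alpha) at position i using min and + on the SPC trellis."""
    d = len(v)
    fwd = [[INF] * q for _ in range(d + 1)]
    bwd = [[INF] * q for _ in range(d + 1)]
    fwd[0][0] = 0.0  # "1" of the sum-product semiring becomes 0, "0" becomes +inf
    bwd[d][0] = 0.0
    for t in range(d):
        for s in range(q):
            for b in range(q):
                nxt = (s + b) % q
                fwd[t + 1][nxt] = min(fwd[t + 1][nxt], fwd[t][s] + cost(v[t], b))
    for t in range(d - 1, -1, -1):
        for s in range(q):
            bwd[t][s] = min(cost(v[t], b) + bwd[t + 1][(s + b) % q] for b in range(q))
    best_bar, best_eq = INF, INF
    for s in range(q):
        for b in range(q):
            through = fwd[i][s] + bwd[i + 1][(s + b) % q]
            if b == alpha:
                best_eq = min(best_eq, through)
            else:
                best_bar = min(best_bar, through + cost(v[i], b))
    return -best_bar, -best_eq


def brute_force(v, i: int, alpha: int, q: int) -> Tuple[float, float]:
    """Exhaustive reference over all codewords of the SPC code."""
    d = len(v)
    best_bar, best_eq = INF, INF
    for head in itertools.product(range(q), repeat=d - 1):
        b = head + ((-sum(head)) % q,)
        if b[i] == alpha:
            best_eq = min(best_eq, sum(cost(v[t], b[t]) for t in range(d) if t != i))
        else:
            best_bar = min(best_bar, sum(cost(v[t], b[t]) for t in range(d)))
    return -best_bar, -best_eq


v = [[0.3, -0.2, 0.5], [-0.4, 0.1, 0.0], [0.2, 0.2, -0.6], [0.7, -0.1, 0.3]]
print(min_sum_marginals(v, 1, 2, 4))
print(brute_force(v, 1, 2, 4))
```

No exponentials are needed in this variant, which is one reason the limit $\kappa \to \infty$ is attractive in practice.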

V. Results 

This section presents simulation results for low complexity 
LP decoding which uses the trellis-based check node 
calculations described above. We consider $\kappa \to \infty$ for all simulations. 
We use the binary (504, 252) and (1008, 504) MacKay LDPC 
codes, but with parity-check matrix entries taken from $\mathbb{Z}_4$ 
instead of GF(2). These LDPC codes are (3, 6)-regular codes; 
hence there are 6 nonzero entries in each row of their 
parity-check matrix. We set the second and third nonzero entries in 
each row to 3, and all other nonzero entries are set to 1. 
Furthermore, we assume transmission over the AWGN channel 
where nonbinary symbols are directly mapped to quaternary 
phase-shift keying (QPSK) signals. We simulate up to 100 
frame errors per simulation point. 
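
The construction of the quaternary parity-check matrix from the binary one can be sketched row by row. The function name `lift_row_to_z4` is hypothetical, and the input is assumed to be a dense 0/1 row; the MacKay matrices themselves are not reproduced here.

```python
from typing import List


def lift_row_to_z4(binary_row: List[int]) -> List[int]:
    """Set the 2nd and 3rd nonzero entries of a binary row to 3; others stay 1."""
    out = list(binary_row)
    seen = 0
    for idx, entry in enumerate(binary_row):
        if entry != 0:
            seen += 1
            if seen in (2, 3):
                out[idx] = 3
    return out


# Toy row with six nonzero entries, as in a (3, 6)-regular code.
print(lift_row_to_z4([0, 1, 1, 0, 1, 1, 1, 0, 1]))  # [0, 1, 3, 0, 3, 1, 1, 0, 1]
```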

The error-correcting performance of the (504, 252) and 
(1008, 504) LDPC codes is shown in Figure 3, where the frame 
error rate (FER) of the LCLP decoding algorithm is compared 
with that of the min-sum (MS) algorithm. The MS algorithm 
also uses the trellis of the nonbinary SPC code for check node 
processing. The maximum number of iterations is set to 64 for 
both decoding algorithms. For the (504, 252) code, the FER 
of low complexity LP decoding is within 0.5 dB of that 
of the MS algorithm, and for the (1008, 504) code it is within 0.7 
dB. These results are comparable to those of the binary LCLP 
decoding algorithm of [3]. Finally, it is important to note that 
these LDPC codes are significantly longer than the quaternary 
(80, 48) LDPC code tested in [4]. 

[Figure 3: FER versus Es/N0 (dB), horizontal axis from 3 to 6.5 dB; curves for Low-Complexity LP and Min-Sum decoding of the (504, 252) and (1008, 504) codes.] 

Fig. 3. Frame Error Rate for (504, 252) and (1008, 504) quaternary LDPC 
codes under QPSK modulation. The performance of low complexity LP 
decoding is compared with that of the min-sum algorithm. 

VI. Conclusion 

In this paper, we proposed a modified BCJR algorithm for 
efficient check node processing in the nonbinary LCLP 
decoding algorithm. The proposed algorithm has complexity linear 
in the check node degree. We also proposed an alternative 
state metric which can be used to reduce the run time of the 
proposed algorithm. 

VII. Acknowledgments 

The authors would like to thank P. O. Vontobel for many 
helpful suggestions and comments. This work was supported 
in part by the Claude Shannon Institute, UCD, Ireland. 

References 

[1] M. C. Davey and D. J. C. MacKay, "Low density parity check codes 
over GF(q)," IEEE Communications Letters, vol. 2, no. 6, pp. 165-167, 
June 1998. 

[2] J. Feldman, M. J. Wainwright and D. R. Karger, "Using linear 
programming to decode binary linear codes," IEEE Transactions on 
Information Theory, vol. 51, no. 3, pp. 954-972, March 2005. 

[3] P. O. Vontobel and R. Koetter, "Towards low-complexity 
linear-programming decoding," in Proc. of 4th International Conference on 
Turbo Codes and Related Topics, Munich, Germany, April 3-7, 2006. 

[4] M. F. Flanagan, V. Skachek, E. Byrne, and M. Greferath, 
"Linear-Programming Decoding of Nonbinary Linear Codes," IEEE Transactions 
on Information Theory, vol. 55, no. 9, pp. 4134-4154, September 2009. 

[5] M. Punekar and M. F. Flanagan, "Low Complexity LP Decoding of 
Nonbinary Linear Codes," The Forty-Eighth Annual Allerton Conference 
on Communication, Control, and Computing, September 29 - October 
1, 2010. 



